ESTIMATION OF VARIANCE OF THE RATIO ESTIMATOR

Chien-Fu Wu

Technical Summary Report #2267
August 1981
ABSTRACT

A general class of estimators of the variance of the ratio estimator is considered, which includes two standard estimators $v_0$ and $v_2$ and approximates another estimator $v_H$ suggested by Royall and Eberhardt (1975). Asymptotic expansions for the variances and biases of the proposed estimators are obtained. Based on this we obtain the optimal variance estimator in the class and compare the relative merits of three estimators $v_0$, $v_1$ and $v_2$ without any model assumption. Under a simple regression model a more definite comparison of $v_0$, $v_1$ and $v_2$ is made in terms of variance and bias.
SIGNIFICANCE AND EXPLANATION

In estimating the population mean of a character $y$, we often make use of an auxiliary covariate $x$ whose information is more readily available and is positively correlated with $y$. One commonly used estimator in survey sampling is the ratio estimator ($y$-sample mean)($x$-population mean)/($x$-sample mean). To assess the variability of the estimator, we need an estimator of its variance. Several variance estimators have been compared under the assumption that the finite population itself is a random sample from an infinite superpopulation that is described by a linear model. Such an assumption may not be realistic in practice and usually is hard to verify. We propose a class of variance estimators, which includes or approximates several existing variance estimators in the literature. We then find the asymptotic variance and bias of these estimators and determine the optimal estimators for minimizing variance or bias. No superpopulation model is assumed. If we do assume a regression model over the finite population, strong optimality results are obtained and more definite comparisons of estimators are made.

The responsibility for the wording and views expressed in this descriptive summary lies with MRC, and not with the author of this report.


1. Introduction

Suppose a population consists of $N$ distinct units with values $(y_i, x_i)$, $x_i > 0$, $i = 1, \dots, N$. Denote the population means of $y_i$ and $x_i$ by $\bar Y$ and $\bar X$. To estimate $\bar Y$, it is customary to take a simple random sample of size $n$ and use the ratio estimator $\hat{\bar Y}_R = \bar y \bar X/\bar x$, where $\bar y$ and $\bar x$ are respectively the sample means of $y_i$ and $x_i$. The mean square error and variance of $\hat{\bar Y}_R$ are each approximated by (Cochran, 1977, p. 155)

$$V = \frac{1-f}{n}\,\frac{1}{N-1}\sum_{i=1}^{N}\Big(y_i - \frac{\bar Y}{\bar X}x_i\Big)^2, \tag{1}$$

where $f = n/N$ is the sampling fraction. Two commonly used estimators of $V$ are


$$v_0 = \frac{1-f}{n}\,\frac{1}{n-1}\sum_{i=1}^{n}(y_i - r x_i)^2 \tag{2}$$

and

$$v_2 = \frac{1-f}{n}\,\frac{\bar X^2}{\bar x^2}\,\frac{1}{n-1}\sum_{i=1}^{n}(y_i - r x_i)^2, \tag{3}$$


where $r = \bar y/\bar x$. The asymptotic consistency of $v_0$ and $v_2$ and the asymptotic normality of $\hat{\bar Y}_R$ were rigorously established in Scott and Wu (1981). Although the original motivation for $\bar X^{-2} v_2 = v_0/\bar x^2$ as a variance estimator for the ratio $R = \bar Y/\bar X$ is the unavailability of $\bar X$, it is not clear whether $v_2$ is worse than $v_0$ (Cochran, 1977, p. 155). Rao and Rao (1971) studied the small-sample properties of $v_0$ and $v_2$ by assuming that the sample is a random sample directly from an infinite superpopulation that can be described by a simple linear regression model of $y$ on $x$ and that $x$ has a gamma distribution. The nature of random sampling from a finite population is not taken into full account in their formulation, as distinguished from the usual superpopulation approach. Royall and Eberhardt (1975) noted the bias of $v_0$ when the finite population is a random sample from a superpopulation in which

$$y_i = \beta x_i + \varepsilon_i, \tag{4}$$


where the $\varepsilon_i$ are independent with mean zero and variance $\sigma^2 x_i^t$. They suggested the simple modification

$$v_H = v_0\,\frac{\bar x_c \bar X}{\bar x^2}\Big(1 - \frac{C_x^2}{n}\Big)^{-1}, \tag{5}$$


where $C_x$ is the $x$-sample coefficient of variation and $\bar x_c$ is the mean of the $N-n$ units not in the sample. They also noted that $v_H \approx v_2$ for large $n$ and $N \gg n$, thus justifying the use of $v_2$ from a superpopulation viewpoint. The estimator $v_2$ was previously recommended by Hajek (1958). The estimator $v_H$ is $\xi$-unbiased under model (4) with $t = 1$, and remains approximately $\xi$-unbiased when the variance of $\varepsilon_i$ in model (4) is not proportional to $x_i$. An empirical study of $v_H$ was reported in Royall and Cumberland (1978). All these authors assume that the actual population satisfies a hypothetical infinite population model. It is desirable to have a model-free comparison of these estimators. As Royall and Cumberland (1978, p. 334) pointed out, the conventional theory in sample surveys does not provide any comparison of the relative merits of $v_0$ and $v_2$. One of the purposes of this paper is to provide such a model-free comparison of $v_0$ and $v_2$.

0  £ 

We consider a more general class of variance estimators

$$v_g = \Big(\frac{\bar X}{\bar x}\Big)^{g} v_0,$$

which includes $v_0$, $v_2$ and a new estimator

$$v_1 = \frac{1-f}{n}\,\frac{\bar X}{\bar x}\,\frac{1}{n-1}\sum_{i=1}^{n}(y_i - r x_i)^2,$$

which is equal to $(v_0 v_2)^{1/2}$. Since the numerator of $v_H$ is a linear combination of $\bar X$ and $\bar x$, it can be adequately approximated by some $v_g$. In §2 we obtain the leading terms of the mean square error and variance of $v_g$, which happen to be a quadratic function in $g$. The optimal variance estimator is then obtained by minimizing this quadratic function. The optimal $g$, denoted $g_{\mathrm{opt}}$, is equal to the population regression coefficient of $z_i/\bar Z$ over $x_i/\bar X$, $i = 1, \dots, N$, where $z_i$, defined in (12), depends on the "residual" $e_i = y_i - R x_i$ and $e_i^2$. Therefore among $v_0$, $v_1$ and $v_2$, $v_0$ is the best if $g_{\mathrm{opt}} \le 0.5$, $v_1$ the best if $0.5 \le g_{\mathrm{opt}} \le 1.5$, and $v_2$ the best if $g_{\mathrm{opt}} \ge 1.5$. By further assuming the superpopulation model (4) we show the optimality of $v_0$ among $v_g$ under $t = 0$ and the optimality of $v_1$ among $v_g$ under $t = 1$. Note that the ratio estimator $\hat{\bar Y}_R$ is the best linear unbiased estimator of $\bar Y$ under model (4) with $t = 1$ (Brewer, 1963; Royall, 1970). If the ratio estimator is adopted with this optimality property in mind, then according to our result, one ought to use $v_1$ as the estimate of variance. Under model (4) with $t \ge 1$, it is also shown in §2 that $g_{\mathrm{opt}} \ge 1$, with the implication that $v_1$ and $v_2$ are better than $v_0$. Therefore our study justifies the use of $v_1$ and $v_2$ in practice when one believes that model (4) with $t \ge 1$ adequately describes the population. Under further distributional assumptions on $x$, $g_{\mathrm{opt}}$ is determined as a function of $t$. In §3 we obtain the leading terms of the bias of $v_g$, and then compare $v_0$, $v_1$ and $v_2$ in terms of their biases under model (4) with general variance pattern $t$. By further assuming that $x$ has a gamma distribution, we show that, among $v_0$, $v_1$ and $v_2$, $v_0$ is the least biased for $t \ge 1.5$ or $t \le 0.6$, $v_1$ the least biased for $0.6 \le t \le 6/7$ and $v_2$ the least biased for $6/7 \le t \le 1.5$.
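The estimators introduced in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ratio_variance_estimators`, the toy sample, and the values `X_bar=3.0` and `N=40` are hypothetical and do not come from the report.

```python
import numpy as np

def ratio_variance_estimators(y, x, X_bar, N, g_values=(0.0, 1.0, 2.0)):
    """Return the ratio estimate of the mean and v_g = (X_bar/x_bar)^g * v_0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = y.size
    f = n / N                                # sampling fraction
    y_bar, x_bar = y.mean(), x.mean()
    r = y_bar / x_bar                        # sample ratio
    y_hat_R = r * X_bar                      # ratio estimator of the mean
    resid = y - r * x                        # y_i - r x_i
    v0 = (1.0 - f) / n * np.sum(resid**2) / (n - 1)   # estimator (2)
    v = {g: (X_bar / x_bar) ** g * v0 for g in g_values}
    return y_hat_R, v

# Hypothetical toy sample (n = 4) from a population of N = 40 units
y = [2.1, 3.9, 6.2, 8.1]
x = [1.0, 2.0, 3.0, 4.0]
est, v = ratio_variance_estimators(y, x, X_bar=3.0, N=40)
print(est, v)
```

With `g_values=(0, 1, 2)` the dictionary holds $v_0$, $v_1$ and $v_2$, and $v_1$ is the geometric mean of the other two by construction.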


(6) 

(7) 

(8) 


$$E(\bar x - \bar X)(\bar y - \bar Y)(\bar z - \bar Z) = O(n^{-2}), \tag{9}$$

where $\delta x = (\bar x - \bar X)/\bar X$, $\bar z$ is the sample mean of character $z$ from the same simple random sample as $x$ and $y$, $\bar Z$ is the corresponding population mean, $O_p(n^{-1})$ is of stochastic order $n^{-1}$, $\bar e = n^{-1}(e_1 + \dots + e_n)$ and $e_i = y_i - R x_i$ is the residual of $y_i$ from the line connecting $(\bar X, \bar Y)$ and the origin. Note that $e_1 + \dots + e_N = 0$. Formulas (6) and (7) follow easily from (8), and formulas (8), (9) can be rigorously justified as in David and Sukhatme (1974). Using (6), (8) and (9), we expand


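$$v_0 = \frac{1-f}{n}\Big\{\frac{1}{n}\sum_{i=1}^{n} e_i^2 - 2\,\frac{\sum_{j=1}^{N} x_j e_j}{\sum_{j=1}^{N} x_j}\,\bar e - 2\,\frac{\bar e}{\bar X}\Big(\frac{1}{n}\sum_{i=1}^{n} x_i e_i - \frac{1}{N}\sum_{j=1}^{N} x_j e_j\Big)\Big\} + \cdots \tag{10}$$

$$\phantom{v_0} = \frac{1-f}{n}\,\bar z + O_p(n^{-2}), \tag{11}$$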


where

$$\bar z = n^{-1}\sum_{i=1}^{n} z_i, \qquad z_i = e_i^2 - 2\,\frac{\sum_{j=1}^{N} x_j e_j}{\sum_{j=1}^{N} x_j}\,e_i. \tag{12}$$


Note that $\bar Z = N^{-1}(e_1^2 + \dots + e_N^2)$ and $E\,v_0 = V + O(n^{-2})$. From (7), (8) and (11),

$$v_g = \frac{1-f}{n}\{\bar z - g(\delta x)\bar Z\} + O_p(n^{-2}), \tag{13}$$


and the mean square error and variance of $v_g$ are each

$$\mathrm{var}(v_g) = \Big(\frac{1-f}{n}\Big)^{3}\Big\{S_z^2 - 2g\,\frac{\bar Z}{\bar X}\,S_{zx} + g^2\,\frac{\bar Z^2}{\bar X^2}\,S_x^2\Big\} + O(n^{-4}), \tag{14}$$

where $S_z^2$ and $S_{zx}$ are the population variance of $z$ and the population covariance of $z$ and $x$, respectively, and $S_x^2$ is the population variance of $x$. The bias square of $v_g$ is of lower order than the variance and will be studied in §3.


2.2 Optimal choice of $g$ and comparison of $v_0$, $v_1$, $v_2$

The optimal variance estimator is now obtained by minimizing expression (14) with respect to $g$, the optimal $g$, denoted $g_{\mathrm{opt}}$, being

$$g_{\mathrm{opt}} = \frac{S_{xz}/(\bar X \bar Z)}{S_x^2/\bar X^2}, \tag{15}$$

which is the population regression coefficient of $z_i/\bar Z$ over $x_i/\bar X$, $i = 1, \dots, N$. Therefore, if $n$ is large and computational cost is not a problem, we propose the optimal estimator $v_{\hat g}$, where $\hat g$ is a sample analogue of $g_{\mathrm{opt}}$. For estimation of the population mean Srivastava (1967) suggested a similar estimator $\bar y(\bar X/\bar x)^g$. Das and Tripathi (1978) considered estimators of a similar type for the finite population variance of $y$. In both papers the optimal $g$ for minimizing the asymptotic mean square error was found.
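One possible sample analogue $\hat g$ can be sketched as follows. The report does not spell out which sample analogue is intended, so the choice here is an assumption: population quantities in (12) and (15) are replaced by their sample counterparts, with $e_i$ replaced by $y_i - r x_i$. The function names `scaled_slope` and `g_hat` and the data are hypothetical.

```python
import numpy as np

def scaled_slope(z, x):
    """Regression coefficient of z/mean(z) on x/mean(x), the form used in (15)."""
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    s_xz = np.cov(x, z, ddof=1)[0, 1]        # sample covariance of x and z
    s_xx = np.var(x, ddof=1)                 # sample variance of x
    return (s_xz / (x.mean() * z.mean())) / (s_xx / x.mean() ** 2)

def g_hat(y, x):
    """Plug-in estimate of g_opt (assumed form, not taken from the report)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    r = y.mean() / x.mean()
    e_hat = y - r * x                        # sample residuals, sum to zero
    # sample version of z_i in (12)
    z_hat = e_hat**2 - 2.0 * (np.sum(x * e_hat) / np.sum(x)) * e_hat
    return scaled_slope(z_hat, x)

# Hypothetical sample
x = np.array([1.0, 2.0, 3.0, 5.0, 8.0])
y = np.array([1.2, 1.7, 3.5, 4.6, 8.4])
print(g_hat(y, x))
```

The estimate is undefined when the sample residuals are all zero, since the mean of `z_hat` then vanishes; a practical implementation would need to guard against that case.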

In practice we may not want to compute $\hat g$ and will choose the variance estimate among $v_0$, $v_1$ and $v_2$. Since the leading terms in $\mathrm{var}(v_g)$ are quadratic in $g$, we conclude that, among the three, $v_0$ is the best if $g_{\mathrm{opt}} \le 0.5$, $v_1$ the best if $0.5 \le g_{\mathrm{opt}} \le 1.5$ and $v_2$ the best if $g_{\mathrm{opt}} \ge 1.5$. To further relate our estimator $v_g$ to the usual ratio estimator and the more general estimator proposed by Srivastava (1967), we approximate $y_i - r x_i$ by $y_i - R x_i = e_i$, and the variance estimation problem is now reduced to estimating the population mean $\bar D = N^{-1}(e_1^2 + \dots + e_N^2)$ by the sample mean $v_0 = n^{-1}(e_1^2 + \dots + e_n^2)$, or by the ratio estimator $v_1 = v_0 \bar X/\bar x$, or by the unfamiliar $v_2 = v_0(\bar X/\bar x)^2$. The usual comparison of the sample mean and the ratio estimator (Cochran, 1977, §6.6) and the more general comparison in Srivastava (1967) would suggest that $v_0$ is less efficient than $v_1$ and $v_2$ if the population regression coefficient of $e_i^2/\bar D$ over $x_i/\bar X$ is greater than 1/2. Because of the error introduced in the approximation $y_i - r x_i \approx e_i$, the exact condition (15) involves the less intuitive $z_i$ rather than $e_i^2$. Some readers may prefer the preceding interpretation in terms of the regression of the squared residual $e_i^2$ over $x_i$. Under model (4), $\xi S_{xz} \approx \xi S_{xe^2}$, where $S_{xe^2}$ is the population covariance of $x_i$ and $e_i^2$, so that $z_i$ can indeed be replaced by $e_i^2$. Here $\xi$ denotes expectation with respect to the model.
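The cut-offs 0.5 and 1.5 quoted above can be checked directly from the quadratic in (14). The symbol $Q(g)$ below is introduced only for this check and denotes the bracketed leading term of $\mathrm{var}(v_g)$; completing the square uses the definition of $g_{\mathrm{opt}}$ in (15).

```latex
\begin{align*}
Q(g) &= S_z^2 - 2g\,\frac{\bar Z}{\bar X}\,S_{zx} + g^2\,\frac{\bar Z^2}{\bar X^2}\,S_x^2
      = \frac{\bar Z^2}{\bar X^2}\,S_x^2\,(g - g_{\mathrm{opt}})^2 + \text{const},\\[4pt]
Q(g_1) - Q(g_2) &= \frac{\bar Z^2}{\bar X^2}\,S_x^2\,(g_1 - g_2)\,(g_1 + g_2 - 2g_{\mathrm{opt}}).
\end{align*}
```

Hence for $g_1 < g_2$ we have $Q(g_1) \le Q(g_2)$ exactly when $g_{\mathrm{opt}} \le (g_1 + g_2)/2$. Taking $(g_1, g_2) = (0, 1)$ and $(1, 2)$ gives the midpoints 0.5 and 1.5, and $(0, 2)$ gives the midpoint 1 for the direct comparison of $v_0$ with $v_2$.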

To gain further insight, we now assume the superpopulation model (4) with variance proportional to $x_i^t$. For most of the computations involving model (4), it is important to note that, under (4),

$$e_i = y_i - R x_i = \varepsilon_i + O_p(N^{-1/2}), \qquad R = \beta + O_p(N^{-1/2}),$$

if $N^{-1}(x_1^t + \dots + x_N^t)/\bar X^2$ is bounded. The $g$ that minimizes $\xi\{\mathrm{var}(v_g)\}$ under (4) is, up to a term of order $N^{-1}$,

$$g^* = \frac{\big(\sum_{i=1}^{N} x_i^{t+1} - \bar X\sum_{i=1}^{N} x_i^{t}\big)\big(\sum_{i=1}^{N} x_i\big)}{\sum_{i=1}^{N}(x_i - \bar X)^2\,\big(\sum_{i=1}^{N} x_i^{t}\big)}. \tag{16}$$

We have $g^* = 0$ for $t = 0$ and $g^* = 1$ for $t = 1$, which is stated as a proposition.

Proposition 1. Under model (4) with $t = 0$ (or 1), $v_0$ (or $v_1$) is the optimal estimator of $V$ among $v_g$.
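The two special cases behind Proposition 1 are easy to confirm numerically. The sketch below evaluates $g^*$ in the form $(\sum x_i^{t+1} - \bar X \sum x_i^t)(\sum x_i) / \{\sum (x_i - \bar X)^2 \sum x_i^t\}$; the function name `g_star` and the population values are hypothetical.

```python
import numpy as np

def g_star(x, t):
    """Model-based optimal exponent g* for a population of x-values and variance power t."""
    x = np.asarray(x, dtype=float)
    x_bar = x.mean()
    num = (np.sum(x ** (t + 1)) - x_bar * np.sum(x ** t)) * np.sum(x)
    den = np.sum((x - x_bar) ** 2) * np.sum(x ** t)
    return num / den

# Hypothetical population of positive x-values
x_pop = np.array([1.0, 2.0, 3.0, 5.0, 8.0])
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, g_star(x_pop, t))
```

For any set of positive, non-constant $x$-values the function returns 0 at $t = 0$ and 1 at $t = 1$, up to floating-point error.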

We shall point out that $\hat{\bar Y}_R$ is the best linear unbiased estimator of $\bar Y$ under model (4) with $t = 1$ (Brewer, 1963; Royall, 1970). Therefore the ratio-type estimator $v_1$ for variance should be used in situations where the ratio estimator $\hat{\bar Y}_R$ is optimal for estimating the mean. On the other hand, $t = 0$ implies that $e_i^2$ and $x_i$ are not correlated and the optimal variance estimator does not incorporate information on $x$. For $t \ge 1$ we have $\sum x_i^{t+1}\sum x_i \ge \sum x_i^{2}\sum x_i^{t}$, which implies $g^* \ge 1$ for $t \ge 1$. Its implication as to the choice of estimators is stated as follows.

Proposition 2. Under model (4) with $t \ge 1$, the optimal $g^* \ge 1$ and $v_1$, $v_2$ are both better than $v_0$ for estimating $V$.


When $t \ne 0$ or 1, there seems to be no clear-cut comparison of $v_0$, $v_1$, $v_2$. We now assume a distribution on $x$ to facilitate such a comparison. The optimal $g$ becomes

$$g^{**} = \frac{\mathrm{Cov}(x^t, x)\,E(x)}{E(x^t)\,\mathrm{var}(x)}, \tag{17}$$

where expectation is taken with respect to the distribution of $x$. When $x$ has a gamma distribution with two parameters, $g^{**} = t$ irrespective of the values of the parameters. When $x$ has a beta distribution on $[0, M]$, $M > 0$, with parameters $p$ and $q$, $p, q > 0$, $g^{**} = t(p+q+1)/(p+q+t)$. Note that $g^{**} < t$ for $t > 1$ and $g^{**} > t$ for $t < 1$. In particular, when $x$ has a uniform distribution, $g^{**} = 3t/(t+2)$. When $x$ has a lognormal distribution with parameters $\delta$ and $\gamma$, i.e., $\gamma + \delta \log x$ is standard normal, then $g^{**} = (w^{2t} - 1)/(w^2 - 1)$, where $w = \exp(1/(2\delta^2))$. Note that $g^{**} > t$ for $t > 1$ and $g^{**} < t$ for $t < 1$ in this case. Some selected $g^{**}$ values as a function of $t$ are given in Table 1.


Table 1. Optimal g** as a function of t in model (4)

                                     values of t
distribution of x      0      0.5    1.0    1.5    2.0    2.5    3.0

gamma                  0      0.5    1.0    1.5    2.0    2.5    3.0
uniform (p+q=2)        0      0.6    1.0    1.29   1.5    1.67   1.8
beta (p+q=5)           0      0.55   1.0    1.38   1.71   2.0    2.25
lognormal (δ=2)        0      0.47   1.0    1.60   2.28   3.06   3.93
lognormal (δ=1)        0      0.38   1.0    2.03   3.72   6.51   11.11
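The entries of Table 1 can be recomputed from the closed forms for $g^{**}$ under the gamma, beta and lognormal distributions given in the text. The function names below are hypothetical, and $w^2 = \exp(1/\delta^2)$ is used for the lognormal case.

```python
import math

def g_gamma(t):
    """Gamma distribution: g** = t for any parameter values."""
    return t

def g_beta(t, p_plus_q):
    """Beta distribution: g** = t (p+q+1) / (p+q+t); uniform is p+q = 2."""
    return t * (p_plus_q + 1) / (p_plus_q + t)

def g_lognormal(t, delta):
    """Lognormal distribution: g** = (w^(2t) - 1) / (w^2 - 1), w = exp(1/(2 delta^2))."""
    w2 = math.exp(1.0 / delta**2)            # w squared
    return (w2**t - 1.0) / (w2 - 1.0)

ts = (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)
rows = {
    "gamma": [g_gamma(t) for t in ts],
    "uniform (p+q=2)": [g_beta(t, 2) for t in ts],
    "beta (p+q=5)": [g_beta(t, 5) for t in ts],
    "lognormal (delta=2)": [g_lognormal(t, 2) for t in ts],
    "lognormal (delta=1)": [g_lognormal(t, 1) for t in ts],
}
for name, vals in rows.items():
    print(f"{name:22s}", "  ".join(f"{v:6.2f}" for v in vals))
```

Because the beta value depends on $p$ and $q$ only through $p+q$, a single argument is enough for that row.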



Rao and Rao (1971) compared the stability of $v_0$ and $v_2$ under an infinite population regression model and gamma distribution on $x$. They reported that $v_0$ is more stable than $v_2$ for $t = 0$ or 1 and $v_2$ is more stable than $v_0$ for $t = 2$. Their results are consistent with ours.


3. Bias of $v_g$

3.1. Asymptotic expansions

Multiplying (7) by (10) and using (8) and (9) to collect terms of order $n^{-3}$, we obtain


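$$v_g = \frac{1-f}{n}\Big\{\frac{1}{n-1}\sum_{i=1}^{n} e_i^2 - g(\delta x)\frac{1}{n}\sum_{i=1}^{n} e_i^2 + \frac{g(g+1)}{2}(\delta x)^2\bar Z - 2\,\frac{\bar e}{\bar X}\,\frac{1}{n}\sum_{i=1}^{n} x_i e_i + 2(g+1)\,\frac{\bar e\,(\delta x)}{\bar X}\,\frac{1}{N}\sum_{i=1}^{N} x_i e_i + \frac{\bar e^{\,2}}{\bar X^2}\,\frac{1}{N}\sum_{i=1}^{N} x_i^2\Big\} + \cdots$$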


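From Theorem 2.3 of Cochran (1977), the bias of $v_g$ is

$$\mathrm{bias}(v_g) = \Big(\frac{1-f}{n}\Big)^{2}\frac{1}{N^2\bar X^2}\Big\{\frac{g^2+g+2}{2}\sum_{1}^{N} e_i^2\sum_{1}^{N} x_i^2 + 2(g+1)\Big(\sum_{1}^{N} x_i e_i\Big)^{2} - \frac{(g+1)(g-2)}{2}\,N\bar X^2\sum_{1}^{N} e_i^2 - (g+2)\sum_{1}^{N} x_i\sum_{1}^{N} x_i e_i^2\Big\} + O(n^{-3}). \tag{18}$$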


Since the bias square of $v_g$ is of $O(n^{-4})$, smaller than $\mathrm{var}(v_g)$, one would not choose $g$ based on its bias for large samples. But in practice the variance estimators can be seriously biased in small samples (Rao, 1968). The bias can then be reduced by subtracting the sample analogue of expression (18) from the estimate.

3.2. Optimal choice of $g$ and comparison of $v_0$, $v_1$, $v_2$

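Without further assumptions on the population, there is no clear-cut comparison of the biases of $v_0$, $v_1$ and $v_2$. By assuming model (4), we have

$$\xi\,\mathrm{bias}(v_g) = \Big(\frac{1-f}{n}\Big)^{2}\frac{\sigma^2}{N^2\bar X^2}\Big\{\frac{g^2+g+2}{2}\sum_{1}^{N} x_i^{t}\sum_{1}^{N} x_i^2 - \frac{(g+1)(g-2)}{2}\,N\bar X^2\sum_{1}^{N} x_i^{t} - (g+2)\sum_{1}^{N} x_i\sum_{1}^{N} x_i^{t+1}\Big\} + \text{(lower-order terms)}. \tag{19}$$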


Based on (19), the least biased $v_g$ for $t = 0$ or 1 are easily found.

Proposition 3. Under model (4) with $t = 0$, $v_{-1/2}$ is the least biased estimator of $V$ among $v_g$ and the bias is of order $O(n^{-2})$. Under model (4) with $t = 1$, $v_{-1}$ and $v_2$ are the least biased estimators of $V$ among $v_g$ and are the only ones among $v_g$ with bias of order $O(n^{-3})$.
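A short check of Proposition 3, under the assumption that the bracket in (19) is obtained from (18) by replacing $e_i^2$ with its model expectation $\sigma^2 x_i^t$ and dropping the $(\sum x_i e_i)^2$ term as lower order. The symbol $B_t(g)$ is introduced here only for this check; all sums run over $i = 1, \dots, N$ and $\sum x_i = N\bar X$ is used throughout.

```latex
\begin{align*}
B_t(g) &= \frac{g^2+g+2}{2}\sum x_i^{t}\sum x_i^{2}
          - \frac{(g+1)(g-2)}{2}\,N\bar X^{2}\sum x_i^{t}
          - (g+2)\sum x_i\sum x_i^{t+1},\\[4pt]
B_0(g) &= \frac{g^2+g+2}{2}\;N\sum (x_i-\bar X)^2,\\[4pt]
B_1(g) &= \frac{(g+1)(g-2)}{2}\;N\bar X\sum (x_i-\bar X)^2 .
\end{align*}
```

$B_0(g)$ is strictly positive and is minimized at $g = -1/2$, so the leading bias term never vanishes when $t = 0$. $B_1(g)$ vanishes exactly at $g = -1$ and $g = 2$, so only those two members of the class lose the leading term when $t = 1$.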


For $t \ne 0$ or 1 expression (19) does not have a very nice form. To gain further insight we assume that $x$ has a gamma distribution with shape parameter $a > 0$. Then, from (19), the bias of $v_g$ is $C\{(g^2+g+2)/2 - (g+2)t\}$, where $C$ is a positive constant independent of $g$. From this we conclude that, among $v_0$, $v_1$, $v_2$, (i) $v_0$ is the least biased for $t \ge 1.5$ or $t \le 0.6$, (ii) $v_1$ the least biased for $0.6 \le t \le 6/7$ and (iii) $v_2$ the least biased for $6/7 \le t \le 1.5$.
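The three ranges of $t$ follow from comparing the absolute value of the factor $(g^2+g+2)/2 - (g+2)t$ at $g = 0, 1, 2$, which gives $|1 - 2t|$, $|2 - 3t|$ and $|4 - 4t|$. A minimal sketch of that comparison, with hypothetical function names and the positive constant $C$ omitted:

```python
def bias_factor(g, t):
    """Leading bias of v_g under the gamma assumption, up to the constant C > 0."""
    return (g * g + g + 2) / 2 - (g + 2) * t

def least_biased(t, candidates=(0, 1, 2)):
    """Return the g in candidates with the smallest absolute leading bias."""
    return min(candidates, key=lambda g: abs(bias_factor(g, t)))

for t in (0.3, 0.7, 1.0, 1.2, 2.0):
    factors = {g: bias_factor(g, t) for g in (0, 1, 2)}
    print(t, least_biased(t), factors)
```

The crossover points are where two of the absolute values coincide: $t = 0.6$ for $v_0$ and $v_1$, $t = 6/7$ for $v_1$ and $v_2$, and $t = 1.5$ for $v_2$ and $v_0$.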


Rao and Rao (1971) compared the biases of $v_0$ and $v_2$ under an infinite population regression model and gamma distribution on $x$. They reported that $v_2$ is less biased than $v_0$ for $0 < t < 1.5$ and $v_0$ is less biased than $v_2$ for $t = 2$. Their results are again in good accord with ours.

Of course the definition of "bias" here is in terms of estimating the approximate variance $V$ of the ratio estimator, not its true mean square error. From a Monte Carlo study by Rao (1968) on some natural populations, the percent underestimate in $V$ of the true mean square error is between 10 and 15 percent for $n = 4, 6, 8, 12$. For a summary of results see Cochran (1977, p. 164). Therefore, if $V$ underestimates the true mean square error, we shall prefer a variance estimator with a small amount of positive "bias". The only impact of this observation on the previous comparison is for $0 < t < 1.0$, where the biases of $v_0$, $v_1$ can be either positive or negative. Since $\mathrm{bias}(v_2) > \mathrm{bias}(v_1)$ or $\mathrm{bias}(v_0)$ for $0 < t < 1.0$, $v_2$ may be preferred on this ground.


The research was supported by a grant from the National Science Foundation. The comments of the referee are appreciated.